Abstract

Two separate statistical tests are applied to the AGASA and preliminary Auger 
Cosmic Ray Energy spectra in an attempt to find deviation from a pure power-law. 
The first test is constructed from the probability distribution for the maximum event 
of a sample drawn from a power-law. The second employs the TP-statistic, a function 
defined to deviate from zero when the sample deviates from the power-law form, 
regardless of the value of the power index. The AGASA data show no significant 
deviation from a power-law when subjected to both tests. Applying these tests to 
the Auger spectrum suggests deviation from a power-law. However, potentially large 
systematics on the relative energy scale prevent us from drawing definite conclusions 
at this time.

1 Introduction



Nature offers a wide range of phenomena characterized by power-law
distributions: the diameter of moon craters, the intensity of solar flares, the
wealth of the richest people[1] and the intensity of terrorist attacks[2], to
name a few. These distributions are so-called heavy-tailed: the fractional area
under the tail of the distribution is larger than that of a gaussian, so samples
drawn from them are more likely to contain large fluctuations from the mean.
Anatomical$^2$ defects aside, the cosmic ray (CR) energy spectrum follows a
power-law for over ten orders of magnitude. The predicted abrupt

1 Corresponding author, E-mail: jhague@unm.edu 

2 Well-known small deviations from a pure power-law are dubbed "The Knee" and
"The Ankle."

deviation at the very highest energies (the GZK-cutoff[3,4]) has generated a
flurry of theoretical and experimental work in the past half century. Recently,
Bahcall and Waxman (2003)[5] have asserted that the observed spectra (except
AGASA) are consistent with the expected flux suppression above
$5 \times 10^{19}$ eV. However, the incredibly low fluxes combined with as much
as ~50% uncertainty in the absolute energy determination mean that there has
yet to be a complete consensus on the existence of the GZK-cutoff.

With this in mind, we consider statistics which suggest an answer to a different 
question: Do the observed CR spectra follow a power-law? Specifically, these 
studies are designed to inquire whether or not there is a flux deviation relative 
to the power-law form by seeking to minimize the influence of the underlying 
parameters. 

The two experimental data sets considered in this study are the AGASA [8] ex- 
periment and the preliminary flux result of the Pierre Auger Observatory [6, 7]. 
The discussion in §2 uses these spectra to introduce and comment on the
power-law form. The first distinct statistical test is applied to these data in
§3, where we explore the distribution of the largest value of a sample drawn
from a power-law. In §4 we apply the TP-statistic to the CR flux data. This
statistic is asymptotically zero for pure power-law samples regardless of the
value of the power index and therefore offers a (nearly) parameter-free method
of determining deviation from the power-law form. The final section summarizes
our results.



2 The Data 



A random variable X is said to follow a power-law distribution if the
probability of observing a value between x and x + dx is f(x)dx, where
$f(x) = C x^{-\gamma}$. Normalizing this function such that
$\int_{x_{min}}^{\infty} f(x)\,dx = 1$ gives

$$f_X(x) = \frac{\gamma - 1}{x_{min}} \left( \frac{x}{x_{min}} \right)^{-\gamma}. \qquad (1)$$

It is convenient to choose $z = x/x_{min} \Rightarrow dz = dx/x_{min}$, with
$1 \le z < \infty$; doing so yields

$$f_Z(z) = (\gamma - 1)\, z^{-\gamma}. \qquad (2)$$
For reference, one minus the cumulative distribution function $F_Z(z)$ is given
by

$$1 - F_Z(z) = \int_z^{\infty} f_Z(y)\,dy = z^{1-\gamma}. \qquad (3)$$

Taking the log of both sides of equation (1) yields

$$\log f(x) = \log A - \gamma \log x, \qquad (4)$$

where A is an overall normalization parameter, and suggests a method of
estimating $\gamma$: the power index is the negative of the slope of the best
fit line to the logarithmically binned data (i.e. bin-centers with equally
spaced logarithms). In what follows, we refer to the logarithmically binned
estimate [9] of the power index as $\hat{\gamma}$ and assume that the typical
$\chi^2$/NDF is indicative of the goodness of fit. The fitting is done with two
free parameters, namely A and $\gamma$.

Fig. 1. This figure displays published AGASA [8] and Auger [6] CR energy spectra.
Both axes have logarithmic scales to illustrate the power-law behavior. The
vertical axis is the flux J in $(\mathrm{m^2\,sr\,sec\,eV})^{-1}$ and the
horizontal axis is the energy in eV. The best fit lines (see equation (4)) give
$\hat{\gamma}_{AGASA} = 2.80 \pm 0.23$ and $\hat{\gamma}_{Auger} = 2.97 \pm 0.12$
(statistical error only).

The energy flux of two publicly available data sets is shown in Fig. 1. The
red point-down triangles represent the $\log_{10}$ of the binned AGASA flux
values in units of $(\mathrm{m^2\,sr\,sec\,eV})^{-1}$ and the blue point-up
triangles correspond to the Auger flux. The vertical error bars on each bin
reflect the Poisson error based on the number of events in that bin. The
log-binned estimates for each complete CR data set are the slopes of the dashed
lines plotted in Fig. 1.
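As an illustration of the log-binned estimate, the following is a minimal sketch rather than the fitting code used for Fig. 1. It draws ratios $z = x/x_{min}$ from a pure power-law by inverting $1 - F_Z(z) = z^{1-\gamma}$, bins them in steps of 0.1 in $\log_{10} z$, and fits a weighted straight line to the logarithm of the binned density. The function names, the bin width, the minimum bin count and the choice of weights are assumptions made for this example only.

```python
import math
import random


def sample_power_law(n, gamma, rng):
    """Draw n ratios z = x / x_min >= 1 from f_Z(z) = (gamma - 1) z^(-gamma)."""
    # Inverting 1 - F_Z(z) = z^(1 - gamma): with U uniform on (0, 1],
    # z = U^(-1 / (gamma - 1)) follows the power-law.
    return [(1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]


def log_binned_index(z, bin_width=0.1, min_count=10):
    """Estimate gamma from a weighted line fit to the log-binned density."""
    counts = {}
    for value in z:
        k = int(math.log10(value) / bin_width)  # bin number in log10(z)
        counts[k] = counts.get(k, 0) + 1

    xs, ys, ws = [], [], []
    for k, n in counts.items():
        if n < min_count:  # skip sparse bins (assumed threshold)
            continue
        lo, hi = 10.0 ** (k * bin_width), 10.0 ** ((k + 1) * bin_width)
        xs.append((k + 0.5) * bin_width)      # bin center in log10(z)
        ys.append(math.log10(n / (hi - lo)))  # log10 of events per unit z
        ws.append(float(n))                   # Poisson weight: var(log n) ~ 1/n

    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    return -sxy / sxx  # the power index is minus the fitted slope


rng = random.Random(2024)
z = sample_power_law(50_000, 3.0, rng)
gamma_hat = log_binned_index(z)
print(f"log-binned estimate of gamma: {gamma_hat:.2f}")
```

With real flux data the bin contents and their errors would come from the published tables rather than from simulated events.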

In order to check the stability of our estimate, we compute the estimated
power index $\hat{\gamma}$ as a function of the minimum energy $E_{min}$
considered for each of the two CR data sets. The left-most blue (red) point in
Fig. 2 shows $\hat{\gamma}$ for the Auger (AGASA) data taking into account all
of the bin values above $\log E_{min} = 18.5$ ($\log E_{min} = 18.8$), the next
point to the right represents that for all bins above $\log E_{min} = 18.6$
($\log E_{min} = 18.9$), and so on. The vertical error bars on these points
represent the $1\sigma_{\hat{\gamma}}$ error of the estimate. To ensure an
acceptable chi-squared statistic, we demand that at least five bins be
considered, thereby truncating $E_{min}$ at $\log E_{min} = 19.4$ for the Auger
and $\log E_{min} = 19.7$ for the AGASA data set. The $\chi^2$/NDF for the
left-most points is ~0.3 and it increases to ~2.5 for the right-most points for
both experiments. We note that these estimates do not vary widely for the
lowest values of $E_{min}$ and that the values of $\hat{\gamma}$ from these
experiments are consistent.

Fig. 2. To check the stability of $\hat{\gamma}$ we estimate the power index as
a function of the minimum energy $E_{min}$ considered for the AGASA and Auger CR
data sets; see Fig. 1. The left-most point is the slope of the best fit lines
plotted in Fig. 1. The vertical error bars represent the
$1\sigma_{\hat{\gamma}}$ deviation.

The analyses discussed in §3 and §4 depend on the total number of events in
the data set. Since these numbers are not published, we use a simple method for
estimating them from the CR flux data. If the exposure is a constant function
of the energy, then the number of events N in a bin is proportional to the flux
J and the exposure $\eta$, namely $N = J \eta E_{bin-center} \ln(10)/10$, where
the factor $\ln(10)/10$ converts a bin width of 0.1 in $\log E$ into a width in
energy. The Auger exposure is reported to be constant over the energy range
reported, with $\eta_{Auger} = 5.5 \times 10^{16}$ (m$^2$ sr sec). The AGASA
collaboration reports flux data all the way down to $\log E_{min} = 18.5$ but
the exposure of the experiment can be considered approximately constant only
for energies above $\log E_{min} = 18.8$ (see Fig. 14 of [8]), where
$\eta_{AGASA} = 5.1 \times 10^{16}$ (m$^2$ sr sec). Using this method we get a
total of 3567 events with $E > 10^{18.5}$ eV for the Auger flux and 1914 with
$E > 10^{18.8}$ eV for the AGASA experiment.
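The conversion from flux to event count can be sketched as follows. The function name and the flux value are hypothetical and chosen only to show the arithmetic; the exposure is the Auger value quoted above and the bin width of 0.1 in $\log_{10} E$ is assumed from the spacing of the minimum energies.

```python
import math


def events_in_bin(flux, log10_energy, exposure, log10_bin_width=0.1):
    """Event count N = J * eta * E * ln(10) * (bin width in log10 E)."""
    energy = 10.0 ** log10_energy  # bin-center energy in eV
    return flux * exposure * energy * math.log(10.0) * log10_bin_width


# Hypothetical flux value, for illustration only (not taken from either data set).
example_flux = 1.0e-33  # (m^2 sr sec eV)^-1
eta_auger = 5.5e16      # m^2 sr sec, as quoted in the text
n_events = events_in_bin(example_flux, 19.0, eta_auger)
print(f"estimated events in the bin: {n_events:.1f}")
```

Summing this quantity over all bins above a given minimum energy gives the total number of events used in the tests that follow.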



3 The Distribution of the Largest Value 



As evidence suggestive of a GZK-cutoff, an often cited quantity is the flux
suppression, or the ratio of the flux one would expect from a power-law to
that actually observed above a given maximum, say, $z_{max}$. Since
$J \propto N$ one may estimate the flux suppression by estimating the number of
events $N_{sup}$ out of $N_{tot}$ expected above a given maximum as
$N_{sup} = N_{tot} [1 - F_Z(z_{max})] = N_{tot}\, z_{max}^{1-\gamma}$. Thus, the
bin with minimum $z_{max}$ and maximum $\to \infty$ would have a height
$N_{sup}$ if the data continued to follow a power-law above $z_{max}$. As a
test statistic for this quantity, one may consider the Poissonian probability
that the bin height could statistically fluctuate to zero, namely
$\mathcal{P}(0, N_{sup}) = \exp[-N_{tot}\, z_{max}^{1-\gamma}]$.



In this section we derive a similar test statistic based on the distribution of
the maximum event from a power-law sample. The statistic discussed here
approaches $\mathcal{P}(0, N_{sup})$ for large $N_{tot}$ and allows us to show
that the estimation errors associated with $\gamma$ are enough to disallow any
significant conclusion about the presence of flux suppression for the highest
energy CRs.

The form of the power-law distribution allows us to calculate the pdf of the
largest value, $X_{max}$, out of N events. Using equations (1) and (3) we can
say that the probability that any one value falls between x and x + dx and
that all of the others are less than it is $f(x)\,dx \times F(x)^{N-1}$. There
are N ways to choose this event and so the probability for the largest value
to be between x and x + dx is

$$\pi(x)\,dx = N f(x) F(x)^{N-1}\,dx.$$

In terms of the ratio z, this can be written as

$$\pi(z)\,dz = N(\gamma - 1)\, z^{-\gamma} \left( 1 - z^{1-\gamma} \right)^{N-1} dz. \qquad (5)$$

Fig. 3 contains a plot of this distribution for $\gamma = 3.0$ with three
choices of N. The glaring implication of this plot is that even for "small" N
nearly all of the integral of $\pi(z)$ is above $z \sim 10$. This implies that
the probability of the maximum energy event falling below 10 times the minimum
is very small, for a power-law with these parameters.

Motivated by the location and shape of $\pi(z)$ we consider the probability P
that the maximum ratio from a given sample $Z_{max}$ is less than or equal to a
particular value$^3$, say z, in a convenient form as

$$P(Z_{max} \le z) = \int_1^z \pi(t)\,dt = \left[ 1 - z^{1-\gamma} \right]^N. \qquad (6)$$

Indeed, with $\gamma = 3.0$ (as in Fig. 3), $P(Z_{max} \le 10) = 6.6 \times 10^{-3}$
for N = 500, $4.3 \times 10^{-5}$ for N = 1000 and $1.5 \times 10^{-22}$ for
N = 5000. Another way to say this is that if one were to generate $10^5$ sets
of events, each containing 1000 events drawn from a pure power-law with
$\gamma = 3.0$, ~99.99% of these sets



3 For large $N_{tot}$, equation (6) approaches the Poisson probability mentioned
above: $[1 - z_{max}^{1-\gamma}]^{N_{tot}} \to \exp[-N_{tot}\, z_{max}^{1-\gamma}] = \mathcal{P}(0, N_{sup})$.

Fig. 3. A plot of the probability distribution of the maximum of a sample drawn
from a power-law with power index $\gamma = 3.0$. This is the distribution
$\pi(z)$ defined in equation (5), where z is the ratio of the maximum to the
minimum. The sample sizes are N = 500, 1000 and 5000.

would have a maximum element with a value greater than 10 times the mini- 
mum. For 500 events/set the fraction decreases to ~ 99.34%. Such simulations 
were carried out in preparation for this note and the results were consistent 
with equation (6). 
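Equation (6) and its Poisson limit are simple enough to evaluate directly. The sketch below is illustrative only (the function names are not from the original analysis); it evaluates both expressions for $\gamma = 3.0$ and $z = 10$ at the three sample sizes of Fig. 3.

```python
import math


def prob_max_below(z, n, gamma):
    """Equation (6): P(Z_max <= z) for n events from a pure power-law."""
    return (1.0 - z ** (1.0 - gamma)) ** n


def poisson_zero(z, n, gamma):
    """Poisson probability of observing no events above z."""
    return math.exp(-n * z ** (1.0 - gamma))


for n in (500, 1000, 5000):
    p = prob_max_below(10.0, n, 3.0)
    q = poisson_zero(10.0, n, 3.0)
    print(f"N = {n:5d}: equation (6) = {p:.2e}, Poisson limit = {q:.2e}")
```

The two expressions agree more closely as $z^{1-\gamma}$ becomes small, since their logarithms differ at leading order by $N z^{2(1-\gamma)}/2$.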

To apply this idea to the CR spectrum we consider the following null
hypothesis: The flux of CRs follows a power-law with index $\gamma$ for all
energies greater than a given minimum. As a test statistic for this hypothesis
we use P, as defined in equation (6), with the interpretation that if the null
hypothesis is true then P is the probability that the ratio of the maximum
energy to the minimum is less than or equal to the observed ratio. Typically,
the null hypothesis is rejected at the 5% significance level (S.L.) if
P < 0.05.

To calculate the value of P for the observed data sets we need three pieces of
information: the ratio of the maximum observed value to the minimum,
$z_{max}^{obs}$; the number of events $N_{tot}$ with values in the interval
$[1, z_{max}]$; and a reasonable guess for the power index $\gamma$. Since a
larger $z_{max}$ will lead to a larger value of P, we conservatively take the
highest energy AGASA (resp. Auger) event to fall on the upper edge of the
highest energy bin. The method of determining the number of events in each bin
is described in §2 and here the parameter $N_{tot}$ represents the total number
above a given minimum. We use the logarithmically binned estimates and errors
of $\gamma$ discussed in §2.

The plot in Fig. 4 shows $P(Z_{max} \le z_{max}^{obs})$ given $N_{tot}$ and
$\hat{\gamma}$ as a function of the minimum energy considered for each of the
CR data sets in Fig. 2. In particular, for each $E_{min}$ the values of
$N_{tot}$, $z_{max}^{obs}$ and $\hat{\gamma} \pm \sigma_{\hat{\gamma}}$ are
estimated from the CR flux and the resulting P are plotted for the Auger (blue)
and AGASA (red) data. For example, the left-most Auger point represents the
probability that if $N_{tot} = 3567$ events are drawn from a power-law with
$\gamma = 2.97 \pm 0.12$ then there is a 1.9% chance (the error bar in Fig. 4
shows the asymmetric uncertainty from $\sigma_{\hat{\gamma}}$) that the maximum
log-ratio $\log Z_{max}$ would be less than or equal to that reported by the
Auger experiment, namely $\log z_{max}^{obs} = \log(10^{20}/10^{18.5}) = 1.5$.
Taken at face value, one may reject the null hypothesis at the 5% S.L. for this
data set. The left-most AGASA point represents the same probability for the
complete set of AGASA data, namely $P(\log Z_{max} \le 1.6) = 8.4\%$ for
$N_{tot} = 1914$ events drawn from a power-law with $\gamma = 2.80 \pm 0.23$.
Thus we cannot reject the null hypothesis for the AGASA data.

Fig. 4. A plot of the probability that the maximum of a sample drawn from
a power-law will be less than or equal to the maximum observed by the Auger
($E_{max} = 10^{20}$ eV, blue point-up) and AGASA ($E_{max} = 10^{20.4}$ eV, red
point-down) experiments as a function of the minimum energy considered. The
vertical error bars represent the effect of a $1\sigma_{\hat{\gamma}}$ deviation
and the hatched area shows the 5% significance level.

The upper (lower) vertical error bars depicted in Fig. 4 represent the value
of P if we have under (over) estimated the power index by
$\sigma_{\hat{\gamma}}$, that is if $\gamma = \hat{\gamma} \pm \sigma_{\hat{\gamma}}$,
keeping the log-ratio and the total number of events constant. (The possible
errors in the total number of events are on the order of a few percent and are
negligible.) Since the fitting scheme considers successively lower energy bins,
the points (and errors) for each experiment plotted in Fig. 4 are highly
correlated. The upper error bars fall above the 5% S.L. for all minima
considered and therefore the statistical error associated with $\gamma$ is
enough to disallow rejection of the power-law hypothesis.

The biggest systematic measurement uncertainty in the CR data is the
calibration of the energy. This uncertainty leads to an error in the reported
absolute energy values of ~30% for the AGASA [8] data and as much as ~50% for
the highest energy events in the Auger data set. Since the probability
considered here depends only on the ratio of the observed energies, it is
independent of any constant systematic uncertainty in the energy determination.
However, this probability is sensitive to energy errors which vary over the
range considered and will thus cause uncertainty in $z_{max}^{obs}$.

For example, if we take the maximum to be 50% higher (but hold $\gamma = 2.97$
and $N_{tot} = 3567$ constant) the value of P represented by the left-most Auger
point in Fig. 4 changes from 1.9% to 17%. Thus the large uncertainty in
$z_{max}$ combined with the errors associated with $\gamma$ implies that the
preliminary Auger data set does not provide sufficient evidence to reject the
pure power-law hypothesis for all events above $E_{min} = 10^{18.5}$ eV.
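The sensitivity of P to the power index and to the maximum energy can be reproduced with a few lines. This is a sketch using only the numbers quoted in the text for the left-most Auger point; the variable names are chosen for this example.

```python
import math


def prob_max_below(log10_z, n, gamma):
    """Equation (6) in terms of log10 of the maximum-to-minimum energy ratio."""
    return (1.0 - 10.0 ** (log10_z * (1.0 - gamma))) ** n


n_tot, gamma_hat, sigma = 3567, 2.97, 0.12
log10_z_obs = 20.0 - 18.5  # log10(E_max / E_min) for the full Auger set

p_nominal = prob_max_below(log10_z_obs, n_tot, gamma_hat)
# Upper (lower) error bar: the index was under (over) estimated by sigma.
p_upper = prob_max_below(log10_z_obs, n_tot, gamma_hat + sigma)
p_lower = prob_max_below(log10_z_obs, n_tot, gamma_hat - sigma)
# Maximum energy taken 50% higher, with gamma and N_tot held fixed.
p_shifted = prob_max_below(log10_z_obs + math.log10(1.5), n_tot, gamma_hat)

print(f"nominal P          = {p_nominal:.3f}")
print(f"P at gamma + sigma = {p_upper:.3f}")
print(f"P at gamma - sigma = {p_lower:.4f}")
print(f"P with E_max x 1.5 = {p_shifted:.3f}")
```

Because P depends on $\gamma$ and $z_{max}$ through an exponent multiplied by $N_{tot}$, modest shifts in either quantity move P across the 5% threshold.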



4 The TP-Statistic 



Considering the error and extra degree of freedom associated with $\gamma$, an
analysis of a distribution's adherence to the power-law form without reference
to, or regard for, this parameter could lead to enhanced statistical power.
First proposed by V. Pisarenko and D. Sornette$^4$, the so-called
TP-statistic[12, 13] is a function of random variables that (in the limit of
large N) tends to zero for samples drawn from a power-law, regardless of the
value of $\gamma$. (TP stands for tail power, as opposed to TE, also introduced
in [12, 13], which stands for tail exponential.) This section will describe the
TP-statistic and apply it to the CR data.

The raw moments of the pdf in equation (1) are [1]

$$\mu_m = \langle x^m \rangle = \frac{\gamma - 1}{\gamma - 1 - m}\, x_{min}^m, \qquad m < \gamma - 1. \qquad (7)$$

Thus power-laws with $2 < \gamma \le 3$ have a finite mean but an infinite
variance (in the limit of large N) and sample statistics created from these
moments are not particularly helpful. However, taking the natural logarithm of
z allows the integrals to converge and one may write (for all $\gamma > 1$ and
m = 0, 1, 2, ...),

$$\nu_m = \langle \ln^m z \rangle = \int_1^{\infty} (\ln z)^m f_Z(z)\,dz = \frac{m!}{(\gamma - 1)^m}. \qquad (8)$$

The TP-statistic is calculated by noting that $\nu_1^2 - \nu_2/2 = 0$.
Therefore, if we



4 They studied earthquake and financial return data.

use the sample analog of these quantities, namely

$$\hat{\nu}_m = \frac{1}{N} \sum_{i=1}^{N} \ln^m \frac{X_i}{x_{min}}, \qquad (9)$$

then we can define (for all $X_i > u$, with N the number of such events),

$$TP(u) = \left( \frac{1}{N} \sum_{i=1}^{N} \ln \frac{X_i}{u} \right)^2 - \frac{1}{2N} \sum_{i=1}^{N} \ln^2 \frac{X_i}{u}. \qquad (10)$$

By the law of large numbers this sample statistic tends to zero as
$N \to \infty$, independent of the value of $\gamma$. The TP-statistic allows
us to test for a power-law like distribution without comment about the value of
the power index. Furthermore, for any one sample we can vary u from the sample
minimum $X_{min}$ to the sample maximum $X_{max}$ and calculate the TP-statistic
over the range of x in the sample.
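A minimal sketch of equation (10) for unbinned data follows. The function name is ours, and the hard truncation at z = 10 is only a crude stand-in for a spectrum with a suppressed tail; it is not the smooth cut-off form used later in the text.

```python
import math
import random


def tp_statistic(xs, u):
    """Equation (10), using only the events with x >= u."""
    logs = [math.log(x / u) for x in xs if x >= u]
    n = len(logs)
    first = sum(logs) / n                   # sample first log-moment
    second = sum(v * v for v in logs) / n   # sample second log-moment
    return first * first - 0.5 * second


rng = random.Random(7)
gamma = 3.0
# Pure power-law ratios via inverse-transform sampling.
pure = [(1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(20_000)]
# Crude hard cut-off at z = 10 (assumption, for illustration only).
truncated = [x for x in pure if x <= 10.0]

tp_pure = tp_statistic(pure, 1.0)
tp_cut = tp_statistic(truncated, 1.0)
print(f"TP (pure power-law) = {tp_pure:+.4f}")
print(f"TP (truncated tail) = {tp_cut:+.4f}")
```

Evaluating `tp_statistic(xs, u)` for u equal to each sorted sample value traces out TP as a function of the minimum.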

Given complete event lists one may use equation (10) to calculate the TP- 
statistic for the unbinned data. Since only the binned CR flux is publicly 
available we adapt the statistic to a binned analysis and apply it first to an 
example distribution with a cutoff and then to the CR data sets. 



4.1 An Example



In order to build intuition about the TP-statistic and its variance before
studying the CR data, we first apply this statistic to simulated event sets
drawn from both a pure power-law distribution and a similar distribution with a
cut-off. The cut-off pdf is chosen so that it mimics a power-law for the lowest
values but has an abrupt (yet smooth) cut-off at a particular value, say
$x_{cut}$. The functional form we will use here is

$$g(x) = B(\gamma, x_{min}, x_{cut})\, \frac{x^{-\gamma}}{e^{x - x_{cut}} + 1}. \qquad (11)$$

The normalization of this pdf is $B(\gamma, x_{min}, x_{cut})$, the value of
which must be computed numerically. Fig. 5 contains a logarithmically binned
histogram of 3000 events drawn from a pure power-law (black circles) with
$x_{min} = 1.0$ and $\gamma = 3.0$, and two pdf's in the form of equation (11);
the magenta squares have $\log x_{cut} = 1.0$ and the green triangles have
$\log x_{cut} = 1.5$. While arbitrary, the values of these parameters are chosen
to be similar to the AGASA and Auger data (see Fig. 1).
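One way to draw events from a cut-off pdf of this kind is rejection sampling, sketched below. The sketch assumes the cut-off form $g(x) \propto x^{-\gamma} / (e^{x - x_{cut}} + 1)$; the function name and the overflow guard are choices made for this example, and the text does not state how the simulated sets were actually generated.

```python
import math
import random


def sample_cutoff_power_law(n, gamma, x_cut, rng, x_min=1.0):
    """Rejection-sample n values from g(x) ~ x^(-gamma) / (exp(x - x_cut) + 1)."""
    out = []
    while len(out) < n:
        # Proposal: pure power-law above x_min (inverse-transform sampling).
        x = x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
        t = x - x_cut
        # Acceptance probability 1 / (e^t + 1) <= 1; guard against overflow.
        accept = 0.0 if t > 700.0 else 1.0 / (math.exp(t) + 1.0)
        if rng.random() < accept:
            out.append(x)
    return out


rng = random.Random(3)
events = sample_cutoff_power_law(3000, 3.0, 10.0, rng)
print(f"{len(events)} events, largest value {max(events):.2f}")
```

Because the acceptance factor never exceeds one, the normalization B is not needed to generate events; it only matters when the pdf itself is plotted or integrated.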

If we write the sorted (from least to greatest) values from a sample as
$\{X_{(1)}, X_{(2)}, \ldots, X_{(N)}\}$, the solid black line in Fig. 6 is
created by calculating $TP(u = X_{(i)})$ for each value of the 3000 events drawn
from the pure power-law histogram in Fig. 5. The circles represent the mean of
the statistic within the i-th bin, say $\overline{TP}_i$, and the vertical error
bars show the root-mean-squared deviation of the statistic within the bin. Note
that the total number of events considered by the statistic decreases quickly
from left to right, which leads to a bias in and an increasing variance of the
statistic.

Fig. 5. Logarithmically binned histogram of 3000 events drawn from a pure
power-law with $\gamma = 3.0$ and two power-laws with a cut, see equation (11).
The magenta squares are drawn from the distribution with $\log x_{cut} = 1.0$
and the green triangles have $\log x_{cut} = 1.5$. As noted in the text, while
arbitrary, the values of these parameters are chosen to be similar to the AGASA
and Auger data (see Fig. 1).

The jagged magenta line in Fig. 6 shows the most obvious deviation from the
power-law form; it is systematically offset from zero for nearly all minima
of the data set. Of course, with 3000 events the histograms (see Fig. 5) are
enough to distinguish between these two distributions. But the TP-statistic
allows us to see this deviation by considering the entire data set (the
left-most magenta point in Fig. 6), not just by analyzing the events in the
uppermost bins. The green line in the figure shows TP(u) for events drawn from
equation (11) with $\log x_{cut} = 1.5$. The histogram for this set is not as
clearly different from the power-law as the magenta points and neither is the
TP-statistic; the left-most green point shows no more deviation from zero than
the power-law. However, as the minimum increases (and nears $x_{cut}$) the
statistic moves away from zero (more noise notwithstanding) and suggests that
the data above the minimum deviate from the power-law.

It is important to note that the TP-statistic is positive for both of the
cutoff distributions. Recall that for a pure power-law, $\nu_1^2 - \nu_2/2 = 0$.
The cutoff distribution, however, lacks an extended tail and will therefore
have a smaller second log-moment $\nu_2$ as compared with (the square of) the
first log-moment $\nu_1$ and will thus result in a positive TP-statistic. A
distribution with an enhancement, rather than a cutoff, in the tail would
result in a negative TP-statistic, since it would have a larger second
log-moment (i.e. a larger "variance"). See the Appendix (§6) for a detailed
discussion of the TP-statistic applied to the double power-law.

Fig. 6. The TP-statistics, defined in equation (10), as a function of minimum
value u for the 3 sets of 3000 events plotted in Fig. 5. Also plotted is the
mean of the TP-statistic within each of the logarithmically spaced bins, which
is referred to in the text as $\overline{TP}$. The vertical error bars represent
the RMS deviation of the statistic within each bin. Parenthetically, with
increased statistics, say 10,000 events, the distinct characteristics of the
TP-statistic for a pure power-law, a power-law with a cutoff
$\log x_{cut} = 1.0$ or a power-law with a cutoff $\log x_{cut} = 1.5$ become
more clearly different.

To quantify the significance of the TP-statistic's deviation from zero, $10^4$
sets of 3000 events were generated for each of the three distributions
discussed in this section. For each set we calculate the mean TP-statistic
$\overline{TP}$ within each of the logarithmically spaced bins. The resulting
distribution of $\overline{TP}$'s within each bin is then fitted to a gaussian.

The black circles in Fig. 7 represent the mean of the gaussian fit to the
distribution of $\overline{TP}$'s within each bin for a power-law and the error
bars on the points represent the fitted $1\sigma$ deviation of the
$\overline{TP}$'s. We interpret the left-most of these points in the following
way: for 3000 events drawn from a power-law the "expected value" of
$\overline{TP}$ in the first bin is effectively indistinguishable from zero, as
expected.
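A reduced version of this Monte-Carlo can be sketched as follows. It is a simplification under stated assumptions: it uses 300 sets instead of $10^4$, evaluates the statistic only at $u = x_{min}$ rather than averaging within logarithmic bins, and reports the sample mean and standard deviation instead of a gaussian fit. It relies on the fact that $\ln z$ is exponentially distributed with rate $\gamma - 1$ when z follows equation (2).

```python
import math
import random


def simulate_tp(n_sets, n_events, gamma, rng):
    """TP(u = x_min) for each simulated pure power-law set."""
    values = []
    for _ in range(n_sets):
        # ln(z) for a power-law ratio z is exponential with rate gamma - 1.
        logs = [rng.expovariate(gamma - 1.0) for _ in range(n_events)]
        first = sum(logs) / n_events
        second = sum(v * v for v in logs) / n_events
        values.append(first * first - 0.5 * second)
    return values


rng = random.Random(11)
tps = simulate_tp(300, 3000, 3.0, rng)
mean_tp = sum(tps) / len(tps)
std_tp = math.sqrt(sum((t - mean_tp) ** 2 for t in tps) / (len(tps) - 1))
print(f"mean TP = {mean_tp:+.5f}, standard deviation = {std_tp:.5f}")
```

The spread of the simulated values sets the scale against which an observed TP value has to be judged.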

Though the statistic itself does not depend on $\gamma$, the variance on this
value does. The reason for this is that the variance of the $\overline{TP}$'s
depends on the average total number of events greater than a given minimum,
which is influenced by $\gamma$. In this case the total number of events per
set for minima in the first bin is at least a few thousand and the standard
deviation of the $\overline{TP}$'s is $\sigma_{\overline{TP}} \sim 0.005$. These
errors increase from left to right since each successively higher bin will
contain $\overline{TP}$'s based on fewer and fewer events.

Fig. 7. The fitted mean and $1\sigma$ deviation of the $\overline{TP}$'s (see
definition in text) within each bin for the three distributions described in
the text. This plot is the result of $10^4$ simulated sets of events, where
Fig. 6 is one example, and where each set contains 3000 events.



The magenta squares represent the fitted mean $\overline{TP}$ as a function of
$x_{min}$ for sets drawn from a power-law with a cut-off at
$\log x_{cut} = 1.0$. They deviate from zero for all but the largest $x_{min}$.
Furthermore, this offset is statistically significant for the lowest few bins
of $x_{min}$, where the statistic reflects the deviation from power-law
considering most of the events in the set. The green triangles show the fitted
means for the $\log x_{cut} = 1.5$ distribution. They also display some
deviation from zero, but they are not as significant since they fall near the
$1\sigma$ errors for the pure power-law distribution.

Indeed, one may inquire as to which of the bins deviate the most from the
simulated power-law. This is equivalent to asking, "above what minimum do
the data generated from this cut-off distribution maximally deviate from a
pure power-law?" To quantify this deviation, we use a P-value given by

$$P_{TP} = 1 - \frac{1}{\sqrt{2\pi}} \int_{-\beta}^{\beta} e^{-t^2/2}\,dt, \qquad (12)$$

where

$$\beta = \frac{|\mu_1 - \mu_2|}{\sqrt{\sigma_1^2 + \sigma_2^2}}, \qquad (13)$$

$\mu_i$ is the mean of the fitted gaussian and $\sigma_i$ is the standard
deviation. We reject the pure power-law hypothesis (at the 5% S.L.) if
$P_{TP} < 0.05$.

Fig. 8. The distribution of the $\overline{TP}$'s in the first bin of Fig. 7
(the bin with minimum $\log x_{min} = 0.0$) for the simulated pure power-law
(black, shaded) and a power-law with a cut-off at $\log x_{cut} = 1.0$ (magenta,
hatched). For these distributions $P_{TP} = 0.00218$ (see equation (12)).

The mean of the gaussian fit to the distribution of $\overline{TP}$'s for the
power-law in the bin with minimum $\log x_{min} = 0.0$ is
$\mu_1 = (0.0056 \pm 5.1) \times 10^{-3}$ with a standard deviation
$\sigma_1 = 4.9 \times 10^{-3}$. The mean of the fitted gaussian for the
$\log x_{cut} = 1.0$ distribution in this bin is
$\mu_2 = (1.7 \pm 0.28) \times 10^{-2}$ with a standard deviation
$\sigma_2 = 2.8 \times 10^{-3}$. Therefore, the significance level of the
deviation is $P_{TP} = 2.18 \times 10^{-3}$ and we can reject the pure power-law
hypothesis for this distribution. The distribution of the $\overline{TP}$'s for
this bin is plotted in Fig. 8 for the pure power-law (black, shaded) and the
$\log x_{cut} = 1.0$ (magenta, hatched) pdf. The maximum deviation for the
$\log x_{cut} = 1.5$ pdf occurs in the bin with minimum $\log x_{min} = 0.4$ and
the corresponding distributions of $\overline{TP}$ are plotted in Fig. 9. The
significance of this deviation is lower: $P_{TP} = 0.298$.
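The P-value of equation (12) is a two-sided gaussian tail probability, which can be written as $\mathrm{erfc}(\beta/\sqrt{2})$. The sketch below (the function name is ours) evaluates it with the rounded means and standard deviations quoted above; because those inputs are rounded, the result is expected to land near, but not exactly on, the quoted $P_{TP}$.

```python
import math


def p_tp(mu1, sigma1, mu2, sigma2):
    """Equations (12)-(13): two-sided gaussian P-value for two fitted means."""
    beta = abs(mu1 - mu2) / math.sqrt(sigma1 ** 2 + sigma2 ** 2)
    # 1 - (1/sqrt(2 pi)) * int_{-beta}^{beta} exp(-t^2/2) dt = erfc(beta/sqrt(2))
    return math.erfc(beta / math.sqrt(2.0)), beta


# Rounded values quoted in the text for the bin with log x_min = 0.0.
p_value, beta = p_tp(0.0056e-3, 4.9e-3, 1.7e-2, 2.8e-3)
print(f"beta = {beta:.2f}, P_TP = {p_value:.2e}")
```

The same function converts any separation in units of $\sqrt{\sigma_1^2 + \sigma_2^2}$ into a P-value, which is how a deviation quoted in $\sigma$ and a quoted $P_{TP}$ can be cross-checked.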
Fig. 9. The distribution of the $\overline{TP}$'s in the fifth bin of Fig. 7
(bin with minimum $\log x_{min} = 0.4$) for the simulated pure power-law (black,
shaded) and a power-law with a cut-off at $\log x_{cut} = 1.5$ (green, hatched).
For these distributions $P_{TP} = 0.298$ (see equation (12)).

4.2 The Cosmic Ray Data



In order to apply the TP-statistic to the CR data, Monte-Carlo simulations
were conducted and analyzed in a manner similar to that discussed in §4.1; we
generate $10^4$ sets of events from the reported flux and the resulting
distribution of $\overline{TP}$ (within each bin) is fitted to a gaussian.
Since the significance of deviation from zero depends on both the power index
and the number of events, we compare each of the Auger and AGASA data sets
with a unique power-law. We take the AGASA experiment to have 1916 events
above $\log E_{min} = 18.8$ and compare the resulting TP-statistics with
those of a power-law with the same minimum and $\gamma = 2.80$. The Auger
spectrum has a power-index estimate of 2.97 considering all of the data above
$\log E_{min} = 18.5$ and a total of 3570 events, so we compare the
TP-statistics arising from the Auger flux to those of a pure power-law with
these parameters.

The application of this scheme to the AGASA spectrum is plotted in Fig. 10 in
red triangles. The black circles represent the average TP-statistic value for
data drawn from a pure power-law with $\hat{\gamma}_{AGASA}$. Both plots have
N = 886 events per sky. The error bars on each point represent the 1-sigma
deviation of the gaussian fit to the distribution of the mean TP-statistic.
Since the AGASA values do not significantly deviate from zero (or the power-law
values) this plot suggests that the AGASA distribution does not significantly
deviate from a pure power-law. The most significant deviation occurs in the bin
with minimum $10^{19.2}$ eV and gives $P_{TP} = 0.161$, which is consistent with
the P-value for this bin discussed in §3. These distributions are plotted in
Fig. 11.



Fig. 10. The fitted mean and $1\sigma$ deviation of the $\overline{TP}$'s (see
definition in the text) within each bin, as a function of $\log E_{min}$ (eV),
for the AGASA spectrum (red triangles) and a pure power-law distribution (black
circles). This plot is the result of $10^4$ simulated sets of events where each
set contains 1916 events and the power-law has index
$\gamma = \hat{\gamma}_{AGASA}$.

Fig. 11. The distribution of $\overline{TP}$'s in the fifth bin of Fig. 10 (the
bin with minimum $E_{min} = 10^{19.2}$ eV) for the pure power-law (black,
shaded) and the AGASA spectrum (red, hatched). For these distributions
$P_{TP} = 0.161$ (see equation (12)).

The simulation results from the Auger spectrum are plotted in Fig. 12. This
plot shows deviation from a power-law for the lowest minima considered. For the
bin with minimum $\log E_{min} = 18.6$ we find $P_{TP} = 1.54 \times 10^{-4}$.
Thus we can say that the Auger spectrum with energies greater than $10^{18.6}$
eV deviates from a power-law by ~$3.78\sigma$, where
$\sigma^2 = \sigma_1^2 + \sigma_2^2$. The distribution of $\overline{TP}$'s for
this minimum energy is plotted in Fig. 13.

Fig. 12. The fitted mean and $1\sigma$ deviation of the $\overline{TP}$'s (see
definition in the text) within each bin for the Auger spectrum (blue triangles)
and a pure power-law distribution (black circles). This plot is the result of
$10^4$ simulated sets of events where each set contains 3570 events and the
power-law has index $\gamma = \hat{\gamma}_{Auger}$.

Since the TP-statistic nearly eliminates the need to estimate $\gamma$, the
biggest systematic uncertainty in analyzing the CR data with the TP-statistic
is likely to be errors in the event energies. Similar to the P-value discussed
in §3, it is only the relative energy errors which can affect the result, since
the TP-statistic depends only on the ratio. However, any elongation of the
observed spectrum brought about by this relative uncertainty would affect the
TP-statistic. Without further study of the CR energy systematics, we cannot
draw a conclusion from the ~$3.78\sigma$ deviation in Fig. 13.

Fig. 13. The distribution of $\overline{TP}$'s in the second bin of Fig. 12 (the
bin with minimum $E_{min} = 10^{18.6}$ eV) for the pure power-law (black,
shaded) and the Auger spectrum (blue, hatched). For these distributions
$P_{TP} = 1.54 \times 10^{-4}$ (see equation (12)).



5 Summary 



In §2 we use the reported (AGASA and Auger) CR fluxes to discuss the power-law
form and illustrate the logarithmically binned estimates of the power index
$\gamma$. The probability P that the maximum value of a sample drawn from a
power-law is less than or equal to a particular value is defined in equation
(6). Using reasonable estimates for $\gamma$, $N_{tot}$ and $z_{max}^{obs}$ from
the CR data sets we calculate P in §3. The value of P is used to test the null
hypothesis that these data sets follow a power-law. The AGASA data give no
reason to reject the hypothesis; $P_{AGASA} \sim 8.4\%$ for the data with
$\log E$ (eV) $> 18.8$. The Auger data give more reason to reject the null
hypothesis, $P_{Auger} \sim 1.9\%$ for the data with $\log E$ (eV) $> 18.5$.
However, consideration of the errors on $\gamma$ prevents any solid conclusion.

For the purpose of statistical analysis it would be useful to eliminate, or at
least minimize, the importance of $\gamma$. The TP-statistic tends (asymptotically)
to zero for power-law samples regardless of the value of $\gamma$ and is the subject of §4. We apply the
TP-statistic to the CR data sets using a Monte-Carlo method described in
§4.2. The AGASA data give a value of $P_{TP} = 0.161$ for energies greater than
$10^{19.2}$ eV, a value consistent with the P-value discussed in §3 (Fig. 4). The
preliminary Auger flux results in a TP-statistic with more significant deviation
from the power-law form: $P_{TP} = 1.54 \times 10^{-4}$ for $E_{min} = 10^{18.6}$ eV. Comparing
this value with the P-value for this bin derived in §3, namely $P \sim 2 \times 10^{-2}$,
illustrates the power of the method based on the TP-statistic, which is essentially
independent of $\gamma$. Better understanding of the relative errors on
the CR energies should lead to a definitive conclusion on the question of a
cut-off in the CR spectrum.
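
The behaviour of the TP-statistic on finite samples can be sketched as follows. The code assumes the sample analogue of $\nu_1^2 - \frac{1}{2}\nu_2$, that is, the squared mean of $\ln(x/u)$ minus half the mean of $\ln^2(x/u)$ over events above a threshold $u$. It does not reproduce the Monte-Carlo procedure of §4.2 or the definition of $P_{TP}$ in equation (12); the function names, the seed, the sample size and the position of the hard cut are all hypothetical choices.

```python
import math
import random


def sample_power_law(n, gamma, x_min, rng):
    """Draw n values from a pdf ~ x**(-gamma) on x >= x_min by inverting the CDF."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]


def tp_statistic(sample, u):
    """Sample analogue of nu_1**2 - nu_2/2 using the events at or above threshold u."""
    logs = [math.log(x / u) for x in sample if x >= u]
    if not logs:
        raise ValueError("no events at or above the threshold")
    m1 = sum(logs) / len(logs)
    m2 = sum(t * t for t in logs) / len(logs)
    return m1 * m1 - 0.5 * m2


rng = random.Random(12345)  # fixed seed so the sketch is repeatable
pure = sample_power_law(20000, gamma=3.0, x_min=1.0, rng=rng)
cut = [x for x in pure if x <= math.exp(1.5)]  # crude hard cut-off, for contrast

tp_pure = tp_statistic(pure, u=1.0)
tp_cut = tp_statistic(cut, u=1.0)
print(f"pure power law: TP = {tp_pure:+.4f}")
print(f"with cut-off:   TP = {tp_cut:+.4f}")
```

The pure sample should give a value scattered around zero, while the truncated sample should give a clearly positive value, without the index $\gamma$ being fitted at any point.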
6 Appendix



In §4.1 we state that the TP-statistic will be distinctly positive for distribu- 
tions which contain a tail-suppression and negative for distributions which 
contain a tail-enhancement (relative to the pure power-law form). In this 
section we numerically compute the TP-statistic for a "double power-law" 
distribution and describe the parameter space associated with this statistic. 

Consider the following probability distribution function: 



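$$
f(x) =
\begin{cases}
A(x_{min}, x_{bend}, \gamma, \delta)\, x^{-\gamma}, & x_{min} \le x < x_{bend}, \\
B(x_{min}, x_{bend}, \gamma, \delta)\, x^{-\delta}, & x_{bend} \le x < \infty,
\end{cases}
\qquad (14)
$$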



where $A(x_{min}, x_{bend}, \gamma, \delta)$ and $B(x_{min}, x_{bend}, \gamma, \delta)$ are chosen such that

$$\lim_{x \to x_{bend}^{-}} f(x) = \lim_{x \to x_{bend}^{+}} f(x)$$

and

$$\int_{x_{min}}^{\infty} f(x)\, dx = 1.$$

This distribution follows a power-law with index $\gamma$ for $x_{min} < x < x_{bend}$, and
$\delta$ for $x > x_{bend}$.
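
The two conditions above fix $A$ and $B$ in closed form: continuity at the bend gives $B = A\, x_{bend}^{\delta - \gamma}$, and the normalization integral then determines $A$. A minimal sketch of that calculation follows, assuming $\gamma > 1$ and $\delta > 1$ so that the density can be normalized; the function names are hypothetical.

```python
def double_power_law_constants(x_min, x_bend, gamma, delta):
    """Return (A, B) for the double power law of equation (14).

    Continuity at x_bend gives B = A * x_bend**(delta - gamma); A then
    follows from requiring the total integral to be 1.
    Assumes gamma > 1, delta > 1 and 0 < x_min < x_bend.
    """
    if not (gamma > 1.0 and delta > 1.0 and 0.0 < x_min < x_bend):
        raise ValueError("need gamma > 1, delta > 1 and 0 < x_min < x_bend")
    # Integral of x**(-gamma) from x_min to x_bend.
    below = (x_min ** (1.0 - gamma) - x_bend ** (1.0 - gamma)) / (gamma - 1.0)
    # Integral of x_bend**(delta - gamma) * x**(-delta) from x_bend to infinity.
    above = x_bend ** (1.0 - gamma) / (delta - 1.0)
    a = 1.0 / (below + above)
    b = a * x_bend ** (delta - gamma)
    return a, b


def double_power_law_pdf(x, x_min, x_bend, gamma, delta):
    """Evaluate f(x) of equation (14); returns 0 below x_min."""
    a, b = double_power_law_constants(x_min, x_bend, gamma, delta)
    if x < x_min:
        return 0.0
    return a * x ** (-gamma) if x < x_bend else b * x ** (-delta)


a, b = double_power_law_constants(x_min=1.0, x_bend=10.0, gamma=3.0, delta=4.0)
print(f"A = {a:.6f}, B = {b:.6f}")
```

For $\gamma = \delta$ the two constants coincide and reduce to the single power-law normalization $(\gamma - 1)\, x_{min}^{\gamma - 1}$.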



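Given the parameter set $\{x_{min}, x_{bend}, \gamma, \delta\}$, we define the TP-statistic for this
distribution as

$$
TP(u) = \left[ \frac{\int_u^{\infty} \ln\!\left(\frac{x}{u}\right) f(x)\, dx}{\int_u^{\infty} f(x)\, dx} \right]^{2}
- \frac{1}{2}\, \frac{\int_u^{\infty} \ln^{2}\!\left(\frac{x}{u}\right) f(x)\, dx}{\int_u^{\infty} f(x)\, dx}.
\qquad (15)
$$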



For $u > x_{bend}$ and/or $\gamma = \delta$ equation (15) is identically zero since it is equal
to $\nu_1^2 - \frac{1}{2}\nu_2$ (see equation (8)). However, equation (15) is non-trivial when
$x_{min} < u < x_{bend}$ and $\gamma \neq \delta$. In what follows, we calculate $TP(u)$ for
$x_{min} < u < x_{bend}$ and various values of $x_{bend}$ and $\delta$ with $x_{min} = 1$ and $\gamma = 3$ fixed.
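
For the double power law the integrals in the TP-statistic can be done analytically, which gives a quick way to evaluate $TP(u)$. The sketch below assumes that the moments of $\ln(x/u)$ are normalized by the probability of exceeding $u$ and that natural logarithms are used; with the substitution $t = \ln(x/u)$ the constants $A$ and $B$ cancel, so only $\gamma$, $\delta$ and $x_{bend}/u$ enter. The function name is hypothetical.

```python
import math


def tp_double_power_law(u, x_bend, gamma, delta):
    """Closed-form TP(u) for the double power law, valid for u >= x_min.

    Uses t = ln(x/u): the density weight is exp(-a*t) below the bend and
    exp(-a*L) * exp(-d*(t - L)) above it, with a = gamma - 1, d = delta - 1
    and L = ln(x_bend/u). Assumes gamma > 1 and delta > 1.
    """
    if u >= x_bend:
        return 0.0  # pure power law with index delta above the bend
    a, d = gamma - 1.0, delta - 1.0
    big_l = math.log(x_bend / u)
    w = math.exp(-a * big_l)  # weight at the bend, shared by both pieces
    # Zeroth, first and second moments of t (unnormalized).
    i0 = (1.0 - w) / a + w / d
    i1 = (1.0 - w * (1.0 + a * big_l)) / a**2 + w * (1.0 + d * big_l) / d**2
    i2 = ((2.0 - w * ((a * big_l) ** 2 + 2.0 * a * big_l + 2.0)) / a**3
          + w * ((d * big_l) ** 2 + 2.0 * d * big_l + 2.0) / d**3)
    return (i1 / i0) ** 2 - 0.5 * i2 / i0


for delta in (2.0, 3.0, 4.0):
    tp = tp_double_power_law(u=1.0, x_bend=10.0, gamma=3.0, delta=delta)
    print(f"delta = {delta}: TP(u = 1) = {tp:+.5f}")
```

With $\gamma = 3$ the value is negative for $\delta = 2$, zero up to rounding for $\delta = 3$, and positive for $\delta = 4$.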

Fig. 14 contains a plot of $\log f(x)$ versus $\log x$ with $\delta = \gamma \pm 1$ for several choices
of $\log x_{bend}$ (namely, for $\log x_{bend}$ varying from 1 to 2 in steps of 0.2). The red
curves correspond to $\gamma < \delta = 4$ (tail-suppression) and the blue curves have
$\gamma > \delta = 2$ (tail-enhancement). The TP-statistic for each of these distributions
is shown in Fig. 15 as a function of $u$. Examination of Fig. 15 suggests the
following conclusions for a given $\gamma$ and $\delta$:

• $TP(u)$ is positive for all values of $u$ and $x_{bend}$ if and only if $\gamma < \delta$, and it is
negative if and only if $\gamma > \delta$.

• For $x_{bend}$ much greater than $x_{min}$, $TP(u = x_{min})$ is approximately zero.
Specifically, as $x_{bend}/x_{min} \to \infty$, $TP(x_{min}) \to 0$.

• The location of the maximum deviation, say $u_0$ where

$$\left( \frac{d}{du}\, TP(u) \right)_{u = u_0} = 0, \qquad (16)$$

is highly correlated with the location of the bend $x_{bend}$. Indeed, we have
found that there is a linear relationship between $\log u_0$ and $\log x_{bend}$ and
that this relationship is independent of whether $\gamma$ is less than or greater
than $\delta$.

• The maximum deviation of the TP-statistic, i.e. $TP(u_0)$, is independent of
$\log x_{bend}$.
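
The last two observations can be checked numerically with a simple grid scan. The sketch below assumes a closed-form evaluation of $TP(u)$ in which the moments of $\ln(x/u)$ are normalized by the probability of exceeding $u$; under that assumption $TP$ depends on $u$ and $x_{bend}$ only through the ratio $x_{bend}/u$, so one would expect $\log x_{bend} - \log u_0$ and $TP(u_0)$ to stay constant as the bend moves. Function names and the grid size are hypothetical.

```python
import math


def tp_double_power_law(u, x_bend, gamma, delta):
    """Closed-form TP(u) for the double power law (assumed normalized form)."""
    if u >= x_bend:
        return 0.0
    a, d = gamma - 1.0, delta - 1.0
    big_l = math.log(x_bend / u)
    w = math.exp(-a * big_l)
    i0 = (1.0 - w) / a + w / d
    i1 = (1.0 - w * (1.0 + a * big_l)) / a**2 + w * (1.0 + d * big_l) / d**2
    i2 = ((2.0 - w * ((a * big_l) ** 2 + 2.0 * a * big_l + 2.0)) / a**3
          + w * ((d * big_l) ** 2 + 2.0 * d * big_l + 2.0) / d**3)
    return (i1 / i0) ** 2 - 0.5 * i2 / i0


def locate_extremum(x_min, x_bend, gamma, delta, n_grid=4000):
    """Scan u on a logarithmic grid in [x_min, x_bend]; return (u_0, TP(u_0))."""
    best_u, best_tp = x_min, 0.0
    span = math.log(x_bend / x_min)
    for k in range(n_grid + 1):
        u = x_min * math.exp(span * k / n_grid)
        tp = tp_double_power_law(u, x_bend, gamma, delta)
        if abs(tp) > abs(best_tp):
            best_u, best_tp = u, tp
    return best_u, best_tp


results = []
for k in range(6):  # log x_bend = 1.0, 1.2, ..., 2.0 as in Fig. 14
    log_x_bend = 1.0 + 0.2 * k
    u0, tp0 = locate_extremum(1.0, 10.0 ** log_x_bend, gamma=3.0, delta=4.0)
    results.append((log_x_bend, math.log10(u0), tp0))
    print(f"log x_bend = {log_x_bend:.1f}  log u_0 = {math.log10(u0):.3f}  TP(u_0) = {tp0:.5f}")
```

A constant offset between $\log x_{bend}$ and $\log u_0$ corresponds to a linear relationship with unit slope; replacing `delta=4.0` by `delta=2.0` repeats the check for the tail-enhanced case.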

Fig. 14. A plot of $\log f(x)$ (see equation (14)) versus $\log x$ with $\delta = \gamma \pm 1$ for several
choices of $\log x_{bend}$ (namely, for $\log x_{bend}$ varying from 1 to 2 in steps of 0.2). The
red curves correspond to $\gamma < \delta = 4$ (tail-suppression) and the blue curves have
$\gamma > \delta = 2$ (tail-enhancement). The darker the curve, the larger $\log x_{bend}$.

To isolate the effects of power index choice, consider the family of distributions
where $\log x_{bend} = 1.0$ is fixed but $\delta$ is allowed to vary. Since the integrals in
equation (15) only converge if $\delta > 2$, the minimum $\delta$ we can choose is $\delta = 2$.
There is no upper bound on $\delta$ so we vary this parameter over the interval
$2 < \delta < 3$ in steps of 0.2 and over the interval $3 < \delta < 10$ in steps of 0.5.
Fig. 16 contains a plot of $\log f(x)$ versus $\log x$ with $\log x_{bend} = 1.0$ and $\gamma = 3$.
The blue curves have $2 < \delta < 3$ (i.e. $\delta - \gamma < 0$) and the red curves have
$3 < \delta < 10$ (i.e. $\delta - \gamma > 0$). The darker the color of these curves, the
closer $\delta$ is to $\gamma$.

Fig. 17 contains a plot of $TP(u)$ for the distributions plotted in Fig. 16. As
noted earlier, $TP(u) > 0$ if and only if $\delta - \gamma > 0$ and $TP(u) < 0$ if and only if
$\delta - \gamma < 0$. The colored points on these curves show where each curve maximally
deviates from zero; the coordinates of these points are $\{u_0, TP(u_0)\}$ for each
curve (see equation (16)). These points show a weak dependence of $\log u_0$ on
$\delta$, for a given $\log x_{bend}$.

Fig. 15. A plot of $TP(u)$ (see equation (15)) for each of the distributions plotted in
Fig. 14. Those distributions with tail-suppression (red) have $TP(u) > 0$ and those
with tail-enhancement (blue) have $TP(u) < 0$.

Fig. 16. A plot of $\log f(x)$ (see equation (14)) versus $\log x$ with $\log x_{bend} = 1.0$ and
$\gamma = 3$. The blue curves have $2 < \delta < 3$ (i.e. $\delta - \gamma < 0$) and the red curves have
$3 < \delta < 10$ (i.e. $\delta - \gamma > 0$). The darker the color of these curves, the closer $\delta$ is
to $\gamma$.

The value of the maximum deviation $TP(u_0)$ also shows dependence on $\delta$. In
Fig. 18 we plot $TP(u_0)$ versus $\delta - \gamma$ for each of the points in Fig. 17. These
plots suggest the following:

• For $-1 < \delta - \gamma < 1$ (blue and black), a small change in $\delta$ will lead to a
large change in $TP(u_0)$.

• If $\delta - \gamma \gg 1$ (bright red), however, a large change in $\delta$ will result in a small
change in $TP(u_0)$. This case is of particular interest since a large $\delta$ will
mimic the cutoff distribution defined in equation (11).

• By inspection of Fig. 18 we note that $TP(u_0) \approx 0.025$ for $\delta - \gamma \gg 1$.

• Comparison with Fig. 15 suggests that the limiting value of $TP(u_0)$ is roughly
independent of $x_{bend}$.

Fig. 17. A plot of $TP(u)$ (see equation (15)) for the distributions plotted in Fig. 16.
The colored points on these curves show where each curve maximally deviates from
zero; the coordinates of these points are $\{u_0, TP(u_0)\}$ for each curve (see equation
(16)).

Fig. 18. A plot of $TP(u_0)$ (see equations (14) and (16)) versus $\delta - \gamma$ for each of the
points in Fig. 17. Note that $TP(u_0) \approx 0.025$ for $\delta - \gamma \gg 1$.

The studies described in this section show that the TP-statistic can distinguish
tail-suppressed ($\delta - \gamma > 0$) from tail-enhanced ($\delta - \gamma < 0$) distributions, i.e.
$TP(u) > 0$ if and only if $\delta - \gamma > 0$ and $TP(u) < 0$ if and only if $\delta - \gamma < 0$.
Furthermore, they show that in the limiting case of $\delta - \gamma \gg 1$ the most
important parameter in determining $u_0$ is $x_{bend}$ but that the limiting value of
$TP(u_0)$ is roughly independent of $x_{bend}$ and $\delta - \gamma$.
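
The limiting value for $\delta - \gamma \gg 1$ can be cross-checked against a sharply truncated power law, which is what the double power law approaches as $\delta \to \infty$. The sketch below assumes the same normalized, natural-logarithm form of the TP-statistic and writes it in terms of the single variable $s = (\gamma - 1)\ln(x_{bend}/u)$; the function name and the grid are hypothetical.

```python
import math


def tp_hard_cutoff(s, gamma):
    """TP for a power law of index gamma truncated sharply at x_bend.

    s = (gamma - 1) * ln(x_bend / u) must be positive. This is the
    delta -> infinity limit of the double power law under the assumed
    form of the TP-statistic.
    """
    a = gamma - 1.0
    e = math.exp(-s)
    norm = 1.0 - e  # probability weight between u and x_bend (up to a factor)
    nu1 = (1.0 - e * (1.0 + s)) / (a * norm)
    nu2 = (2.0 - e * (s * s + 2.0 * s + 2.0)) / (a * a * norm)
    return nu1 * nu1 - 0.5 * nu2


grid = [0.01 * k for k in range(1, 1001)]  # s from 0.01 to 10
s_best = max(grid, key=lambda s: tp_hard_cutoff(s, 3.0))
tp_best = tp_hard_cutoff(s_best, 3.0)
print(f"gamma = 3: maximum TP = {tp_best:.4f} at s = {s_best:.2f}")
```

Because $x_{bend}$ enters only through $s$, the maximum in this limit does not depend on where the bend is placed; the printed maximum can be compared with the value read from Fig. 18.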